Day 27: Go bridge benchmark shock: root-cause breakdown and direction

2025 iThome Ironman Contest

Generous Sharing (佛心分享) - SideProject30

Mongory: Building a Cross-Language, High-Performance Universal Query Engine, part 28 of the series

17th Ironman Contest

法蘭克

2025-09-25 04:38:53

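After the shock on the Ruby side, the Go version had its own jaw-dropping moment:

Pure Go: ≈ 2 ms (100,000 records, simple condition)
cgo-based Mongory: ≈ 90 ms (same condition, same data)

The numbers are brutal, but the conclusion is clear: the bottleneck is not the algorithm but the cost of calls across the Go ↔ C boundary. This post lays out the measurement method, the root-cause breakdown, and the directions for improvement.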

Measurement design (aligned with the code)

Benchmark program (excerpt):

size := 100_000
loops := 5
records := genRecords(size)

// Reference counts from the plain Go implementation; the matcher runs must reproduce them.
expectedSimple := countSimpleQuery(records)
expectedComplex := countComplexQuery(records)

bench("Simple query (Plain Go)", loops, func() { _ = countSimpleQuery(records) })

// Build the matcher once, outside the timed closure, and reuse it for every record.
matcherSimple, _ := mongory.NewCMatcher(map[string]any{
  "age": map[string]any{"$gte": 18},
}, nil)
bench("Simple query (Mongory Matcher)", loops, func() {
  cnt := 0
  for i := range records {
    // Each Match call crosses the Go ↔ C boundary at least once.
    if ok, _ := matcherSimple.Match(records[i]); ok {
      cnt++
    }
  }
  if cnt != expectedSimple { panic("count mismatch") }
})

bench("Complex query (Plain Go)", loops, func() { _ = countComplexQuery(records) })

// Same pattern with an $or condition: age >= 18 OR status == "active".
matcherComplex, _ := mongory.NewCMatcher(map[string]any{
  "$or": []any{
    map[string]any{"age": map[string]any{"$gte": 18}},
    map[string]any{"status": "active"},
  },
}, nil)
bench("Complex query (Mongory Matcher)", loops, func() {
  cnt := 0
  for i := range records {
    if ok, _ := matcherComplex.Match(records[i]); ok {
      cnt++
    }
  }
  if cnt != expectedComplex { panic("count mismatch") }
})

Measurement principles:

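- Take GC noise out of the picture (temporarily): call debug.SetGCPercent(-1), and read runtime.MemStats before and after the loop to watch the trend
- Repeat the measurement and verify consistency (expected vs. actual counts)
- Reuse the matcher, a lesson carried over from the Ruby version, so construction cost does not pollute the numbers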
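
The excerpt calls a `bench` helper without showing it. The following is only a guess at what such a helper could look like under the principles above: the signature mirrors the calls in the excerpt, but the body, the returned average, and the printed fields are assumptions, not the project's actual code.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
	"time"
)

// bench is a hypothetical helper; the real one is not shown in the post.
// It runs fn `loops` times with the collector disabled and returns the
// average duration of one run.
func bench(name string, loops int, fn func()) time.Duration {
	old := debug.SetGCPercent(-1) // disable automatic GC for the measurement
	defer debug.SetGCPercent(old) // restore the previous setting afterwards

	var before, after runtime.MemStats
	runtime.GC() // start from a freshly collected heap
	runtime.ReadMemStats(&before)

	start := time.Now()
	for i := 0; i < loops; i++ {
		fn()
	}
	avg := time.Since(start) / time.Duration(loops)

	runtime.ReadMemStats(&after)
	fmt.Printf("%-32s avg=%v mallocs=%d gcRuns=%d\n",
		name, avg, after.Mallocs-before.Mallocs, after.NumGC-before.NumGC)
	return avg
}

func main() {
	// Synthetic workload, unrelated to Mongory: count values >= 18 in a slice.
	data := make([]int, 100_000)
	for i := range data {
		data[i] = i % 60
	}
	bench("count >= 18 (plain Go)", 5, func() {
		cnt := 0
		for _, v := range data {
			if v >= 18 {
				cnt++
			}
		}
		_ = cnt
	})
}
```

Comparing the malloc counter between the plain Go run and the matcher run is one way to "watch the trend": a large difference would suggest allocation on the Go side is contributing alongside the boundary cost.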

Why 2 ms in pure Go and 90 ms through cgo?

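Boundary call cost:
- Every record requires at least one call into C, and each cgo call carries a fixed overhead (state switching, stack switching, scheduler coordination)
- Shallow value access is already lazy and O(1) per lookup, but each lookup still crosses the boundary, and a single match performs several of them
Reflection cost (secondary):
- The shallow wrapper on the Go side uses reflection to read values; it has been trimmed to a minimum, but it is still visible at 100,000 records
Difference in design philosophy:
- In the Ruby version the bottleneck is Ruby's own computation; in the Go version the bottleneck is the cgo boundary instead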
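
The reflection point can be illustrated in isolation. The post does not show the shallow wrapper, so the two functions below are hypothetical stand-ins: `lookupDirect` is what hand-written Go does, and `lookupReflect` is what a generic wrapper has to do when it only receives an `any`.

```go
package main

import (
	"fmt"
	"reflect"
	"time"
)

// lookupDirect reads a key with a plain map index.
func lookupDirect(rec map[string]any, key string) (any, bool) {
	v, ok := rec[key]
	return v, ok
}

// lookupReflect reads the same key through package reflect, without knowing
// the concrete map type in advance.
func lookupReflect(rec any, key string) (any, bool) {
	rv := reflect.ValueOf(rec)
	if rv.Kind() != reflect.Map || rv.Type().Key().Kind() != reflect.String {
		return nil, false
	}
	// Convert so that maps keyed by a named string type also work.
	k := reflect.ValueOf(key).Convert(rv.Type().Key())
	v := rv.MapIndex(k)
	if !v.IsValid() {
		return nil, false
	}
	return v.Interface(), true
}

func main() {
	rec := map[string]any{"age": 20, "status": "active"}
	const n = 100_000
	hits := 0

	start := time.Now()
	for i := 0; i < n; i++ {
		if _, ok := lookupDirect(rec, "age"); ok {
			hits++
		}
	}
	direct := time.Since(start)

	start = time.Now()
	for i := 0; i < n; i++ {
		if _, ok := lookupReflect(rec, "age"); ok {
			hits++
		}
	}
	reflected := time.Since(start)

	fmt.Println("hits:", hits, "direct:", direct, "reflect:", reflected)
}
```

The timings from a loop like this are rough and machine-dependent; a `testing.B` benchmark would be the more careful way to compare the two paths.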

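Conclusion: when the condition and the data structure are simple enough, pure Go reading memory directly has an overwhelming advantage, and any cross-boundary call gets magnified.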